Resource Constrained Data Stream Clustering with Concept Drifting for Processing Sensor Data
Authors
Abstract
Wireless sensors and mobile devices are widely deployed as data-collecting devices for monitoring real-world systems. They generate large amounts of stream data in real time, and that data must also be processed in real time. A common processing operation is clustering, which automatically groups the elements of a stream into a number of clusters so that elements of the same cluster have maximum similarity and elements of different clusters have minimum similarity. This paper proposes an on-demand framework (SRAStream) based on a concept drift detection mechanism. The detection algorithm measures the distance between the new clusters for the current data and the existing clusters. Re-clustering to identify new clusters is performed only when concept drift occurs, so SRAStream avoids unnecessary, computation-intensive re-clustering. Experiments suggest that the proposed framework works well and greatly improves processing speed in data stream clustering.
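The on-demand idea in the abstract can be illustrated with a minimal Python sketch. It is not the SRAStream algorithm itself, whose details the abstract does not give. It assumes Euclidean distance, a one-pass estimate of the clusters in the current window (each point is assigned to its nearest existing centroid and the group means are taken), and a fixed threshold on the largest centroid shift. The names `window_centroids`, `drift_detected`, `update_on_demand`, `threshold`, and the `recluster` callback are all hypothetical.

```python
import math


def nearest(point, centroids):
    """Return the index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def window_centroids(window, centroids):
    """Cheap estimate of the current window's clusters (an assumption of this sketch).

    Each point is assigned to its nearest existing centroid and the mean of
    each group is returned. A group with no points keeps its old centroid.
    """
    groups = [[] for _ in centroids]
    for point in window:
        groups[nearest(point, centroids)].append(point)
    estimate = []
    for old, group in zip(centroids, groups):
        if group:
            estimate.append(tuple(sum(axis) / len(group) for axis in zip(*group)))
        else:
            estimate.append(old)
    return estimate


def drift_detected(centroids, window, threshold):
    """Report drift when any estimated centroid moved farther than threshold."""
    estimate = window_centroids(window, centroids)
    return max(math.dist(o, n) for o, n in zip(centroids, estimate)) > threshold


def update_on_demand(centroids, window, threshold, recluster):
    """Call the expensive recluster(window) only when drift is detected."""
    if drift_detected(centroids, window, threshold):
        return recluster(window)
    return centroids
```

In this sketch the expensive step (for example a full k-means run passed in as `recluster`) is skipped for every window whose estimated centroids stay within `threshold` of the existing ones, which is where the saving described in the abstract would come from.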
Similar resources
Granularity Adaptive Density Estimation and on Demand Clustering of Concept-Drifting Data Streams
Clustering data streams has found a few important applications. While many previous studies focus on clustering objects arriving in a data stream, in this paper, we consider the novel problem of on demand clustering concept drifting data streams. In order to characterize concept drifting data streams, we propose an effective method to estimate densities of data streams. One unique feature of ou...
Distributed Weighted Clustering of Evolving Sensor Data Streams with Noise
Collecting data from sensor nodes is the ultimate goal of Wireless Sensor Networks. This is performed by transmitting the sensed measurements to some data collecting station. In sensor nodes, radio communication is the dominating consumer of the energy resources which are usually limited. Summarizing the sensed data internally on sensor nodes and sending only the summaries will considerably sav...
Incremental Clustering for the Classification of Concept-Drifting Data Streams
Concept drift is a common phenomenon in streaming data environments and constitutes an interesting challenge for researchers in the machine learning and data mining community. This paper proposes a probabilistic representation model for data stream classification and investigates the use of incremental clustering algorithms in order to identify and adapt to concept drift. An experimental study ...
Resource-aware High Quality Clustering in Ubiquitous Data Streams
Data stream mining has attracted much research attention from the data mining community. With the advance of wireless networks and mobile devices, the concept of ubiquitous data mining has been proposed. However, mobile devices are resource-constrained, which makes data stream mining a greater challenge. In this paper, we propose the RA-HCluster algorithm that can be used in mobile devices for ...
Detecting Concept Drift in Data Stream Using Semi-Supervised Classification
Data stream is a sequence of data generated from various information sources at a high speed and high volume. Classifying data streams faces the three challenges of unlimited length, online processing, and concept drift. In related research, to meet the challenge of unlimited stream length, commonly the stream is divided into fixed size windows or gradual forgetting is used. Concept drift refer...
Journal: IJDWM
Volume 11 (issue not listed)
Pages: not listed
Publication date: 2015